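This article concerns the approximation of smooth, high-dimensional functions from limited samples using polynomials. This task lies at the heart of many applications in computational science and engineering, notably those arising from parametric modelling and uncertainty quantification. It is common to use Monte Carlo (MC) sampling in such applications, so as not to succumb to the curse of dimensionality. However, it is well known that this strategy is theoretically suboptimal: there are many polynomial spaces of dimension $n$ for which the sample complexity scales log-quadratically in $n$. This well-documented phenomenon has led to a concerted effort to design improved, in fact near-optimal, strategies whose sample complexities scale log-linearly, or even linearly, in $n$. Paradoxically, in this work we show that MC is actually a perfectly good strategy in high dimensions. We first document this phenomenon via several numerical examples. Next, we present a theoretical analysis that resolves this paradox for holomorphic functions of infinitely many variables. We show that there exists a least-squares scheme based on $m$ MC samples whose error decays algebraically in $m/\log(m)$, with a rate that is the same as that of the best $n$-term polynomial approximation. This result is non-constructive, since it assumes knowledge of a suitable polynomial space in which to perform the approximation. We next present a compressed sensing-based scheme that achieves the same rate, except for larger polylogarithmic factors. This scheme is practical, and numerically it performs as well as or better than well-known adaptive least-squares schemes. Overall, our findings demonstrate that MC sampling is eminently suitable for smooth function approximation when the dimension is sufficiently high. Hence, the benefits of improved sampling strategies are generally limited to lower-dimensional settings.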
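As a rough illustration of the least-squares setting described in this abstract, the sketch below fits a tensor-product Legendre expansion to Monte Carlo samples drawn uniformly from $[-1,1]^d$. The total-degree index set, the function names `legendre_design_matrix` and `mc_least_squares`, and the test function are assumptions chosen for the example; the abstract does not specify the polynomial space, and this is not the authors' scheme.

```python
import itertools

import numpy as np
from numpy.polynomial import legendre


def legendre_design_matrix(points, multi_indices):
    """Evaluate orthonormal tensor-product Legendre polynomials at the points."""
    m, d = points.shape
    max_deg = max(max(nu) for nu in multi_indices)
    # vander[k][i, j] = P_j(points[i, k]) for coordinate k
    vander = [legendre.legvander(points[:, k], max_deg) for k in range(d)]
    A = np.ones((m, len(multi_indices)))
    for col, nu in enumerate(multi_indices):
        for k, deg in enumerate(nu):
            # sqrt(2*deg + 1) normalises P_deg for the uniform probability measure
            A[:, col] *= np.sqrt(2 * deg + 1) * vander[k][:, deg]
    return A


def mc_least_squares(f, d, total_degree, m, rng):
    """Least-squares fit from m Monte Carlo samples (illustrative sketch only)."""
    # Assumption: a total-degree index set stands in for the "suitable" space.
    multi_indices = [
        nu
        for nu in itertools.product(range(total_degree + 1), repeat=d)
        if sum(nu) <= total_degree
    ]
    x = rng.uniform(-1.0, 1.0, size=(m, d))  # Monte Carlo sample points
    A = legendre_design_matrix(x, multi_indices)
    coeffs, *_ = np.linalg.lstsq(A, f(x), rcond=None)
    return multi_indices, coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda x: np.exp(-0.5 * x.sum(axis=1))  # smooth test function (assumed)
    idx, c = mc_least_squares(f, d=2, total_degree=6, m=400, rng=rng)
    x_test = rng.uniform(-1.0, 1.0, size=(1000, 2))
    rms = np.sqrt(np.mean((legendre_design_matrix(x_test, idx) @ c - f(x_test)) ** 2))
    print(f"basis size n = {len(idx)}, RMS test error = {rms:.2e}")
```

Here the number of samples is taken well above the basis size $n$ so that the least-squares problem is well conditioned; how large $m$ must be relative to $n$ is exactly the sample-complexity question the abstract addresses.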
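In Bora et al. (2017), a mathematical framework was developed for compressed sensing guarantees in the setting where the measurement matrix is Gaussian and the signal structure is the range of a generative neural network (GNN). The problem of compressed sensing with GNNs has since been extensively analyzed when the measurement matrix and/or network weights follow a subgaussian distribution. We move beyond the subgaussian assumption to measurement matrices that are derived by sampling rows of a unitary matrix uniformly at random (including subsampled Fourier measurements as a special case). Specifically, we prove the first known restricted isometry guarantee for generative compressed sensing with subsampled isometries, and provide recovery bounds with nearly order-optimal sample complexity, addressing an open problem of Scarlett et al. (2022, p. 10). Recovery efficacy is characterized by the coherence, a new parameter that measures the interplay between the range of the network and the measurement matrix. Our approach relies on subspace counting arguments and ideas central to high-dimensional probability. Furthermore, we propose a regularization strategy for training GNNs to have favourable coherence with the measurement operator. We provide compelling numerical simulations that support this regularized training strategy: it yields low-coherence networks that require fewer measurements for signal recovery. This, together with our theoretical results, supports coherence as a natural quantity for characterizing generative compressed sensing with subsampled isometries.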
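High-dimensional partial differential equations (PDEs) are a popular mathematical modelling tool, with applications ranging from finance to computational chemistry. However, standard numerical techniques for solving these PDEs are typically affected by the curse of dimensionality. In this work, we tackle this challenge while focusing on stationary diffusion equations defined over a high-dimensional domain with periodic boundary conditions. Inspired by recent progress in sparse function approximation in high dimensions, we propose a new method called compressive Fourier collocation. Combining ideas from compressive sensing and spectral collocation, our method replaces the use of structured collocation grids with Monte Carlo sampling and employs sparse recovery techniques, such as orthogonal matching pursuit and $\ell^1$ minimization, to approximate the Fourier coefficients of the PDE solution. We conduct a rigorous theoretical analysis showing that the approximation error of the proposed method is comparable with the best $s$-term approximation (with respect to the Fourier basis) to the solution. Using a recently introduced random sampling framework, our analysis shows that the compressive Fourier collocation method mitigates the curse of dimensionality with respect to the number of collocation points, under sufficient conditions on the regularity of the diffusion coefficient. We also present numerical experiments that illustrate the accuracy and stability of the method for the approximation of sparse and compressible solutions.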
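To make the sparse-recovery ingredient concrete, the following sketch runs orthogonal matching pursuit on a one-dimensional toy problem: a sparse trigonometric polynomial is recovered from its values at Monte Carlo sample points. It does not discretize a diffusion equation, so it illustrates only the recovery step, not the compressive Fourier collocation method itself. The helper names `fourier_matrix` and `omp`, the frequency range, and the sparsity level are assumptions made for the example.

```python
import numpy as np


def fourier_matrix(points, freqs):
    """Matrix with entries exp(2*pi*i*k*x) for sample point x and frequency k."""
    return np.exp(2j * np.pi * np.outer(points, freqs))


def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily select `sparsity` columns of A."""
    dtype = np.result_type(A, y)
    coeffs = np.zeros(A.shape[1], dtype=dtype)
    residual = y.astype(dtype)
    support, sol = [], None
    for _ in range(sparsity):
        # Select the column most correlated with the current residual.
        correlations = np.abs(A.conj().T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients jointly by least squares.
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ sol
    if support:
        coeffs[support] = sol
    return coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = np.arange(-128, 128)              # assumed frequency range
    c_true = np.zeros(freqs.size, dtype=complex)
    c_true[[5, 100, 200]] = [1.0, -0.8, 0.6]  # assumed 3-sparse coefficients
    x = rng.uniform(0.0, 1.0, size=80)        # Monte Carlo sample points
    A = fourier_matrix(x, freqs)
    c_hat = omp(A, A @ c_true, sparsity=3)
    print("max coefficient error:", np.max(np.abs(c_hat - c_true)))
```

The toy problem uses far fewer samples (80) than candidate frequencies (256), which is the regime where a sparse solver is needed instead of a plain least-squares fit.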
Linguists distinguish between novel and conventional metaphor, a distinction which the metaphor detection task in NLP does not take into account. Instead, metaphoricity is formulated as a property of a token in a sentence, regardless of metaphor type. In this paper, we investigate the limitations of treating conventional metaphors in this way, and advocate for an alternative which we name 'metaphorical polysemy detection' (MPD). In MPD, only conventional metaphoricity is treated, and it is formulated as a property of word senses in a lexicon. We develop the first MPD model, which learns to identify conventional metaphors in the English WordNet. To train it, we present a novel training procedure that combines metaphor detection with word sense disambiguation (WSD). For evaluation, we manually annotate metaphor in two subsets of WordNet. Our model significantly outperforms a strong baseline based on a state-of-the-art metaphor detection model, attaining an ROC-AUC score of .78 (compared to .65) on one of the sets. Additionally, when paired with a WSD model, our approach outperforms a state-of-the-art metaphor detection model at identifying conventional metaphors in text (.659 F1 compared to .626).
A widely acknowledged shortcoming of WordNet is that it lacks a distinction between word meanings which are systematically related (polysemy), and those which are coincidental (homonymy). Several previous works have attempted to fill this gap, by inferring this information using computational methods. We revisit this task, and exploit recent advances in language modelling to synthesise homonymy annotation for Princeton WordNet. Previous approaches treat the problem using clustering methods; by contrast, our method works by linking WordNet to the Oxford English Dictionary, which contains the information we need. To perform this alignment, we pair definitions based on their proximity in an embedding space produced by a Transformer model. Despite the simplicity of this approach, our best model attains an F1 of .97 on an evaluation set that we annotate. The outcome of our work is a high-quality homonymy annotation layer for Princeton WordNet, which we release.
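The alignment step described in this abstract, pairing definitions by their proximity in an embedding space, can be sketched as a nearest-neighbour search under cosine similarity. The function name `align_by_cosine` and the tiny hand-written vectors are assumptions for illustration; in the paper the vectors come from a Transformer model, which is not reproduced here.

```python
import numpy as np


def align_by_cosine(source_vecs, target_vecs):
    """For each source vector, return the index and cosine similarity of the
    closest target vector (illustrative sketch, not the authors' pipeline)."""
    s = source_vecs / np.linalg.norm(source_vecs, axis=1, keepdims=True)
    t = target_vecs / np.linalg.norm(target_vecs, axis=1, keepdims=True)
    similarity = s @ t.T  # similarity[i, j] = cos(source_i, target_j)
    return similarity.argmax(axis=1), similarity.max(axis=1)


if __name__ == "__main__":
    # Hypothetical 2-D "definition embeddings"; real ones would be model outputs.
    source_definitions = np.array([[1.0, 0.0], [0.0, 1.0], [0.2, 1.0]])
    target_definitions = np.array([[0.0, 2.0], [3.0, 0.1]])
    best, score = align_by_cosine(source_definitions, target_definitions)
    for i, (j, sc) in enumerate(zip(best, score)):
        print(f"source {i} -> target {j} (cosine {sc:.3f})")
```

Normalising both sets of vectors first means a single matrix product yields all pairwise cosine similarities at once.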
Binarized Neural Networks (BNNs) are receiving increasing attention due to their lightweight architecture and ability to run on low-power devices. The state of the art for training classification BNNs restricted to few-shot learning is based on a Mixed Integer Programming (MIP) approach. This paper proposes the BeMi ensemble, a structured architecture of BNNs based on training a single BNN for each possible pair of classes and applying a majority voting scheme to predict the final output. The training of a single BNN discriminating between two classes is achieved by a MIP model that optimizes a lexicographic multi-objective function according to robustness and simplicity principles. This approach yields networks whose output is not affected by small perturbations of the input and whose number of active weights is as small as possible, while good accuracy is preserved. We computationally validate our model on the MNIST and Fashion-MNIST datasets using up to 40 training images per class. Our structured ensemble outperforms both BNNs trained by stochastic gradient descent and state-of-the-art MIP-based approaches. While the previous approaches achieve an average accuracy of 51.1% on the MNIST dataset, the BeMi ensemble achieves an average accuracy of 61.7% when trained with 10 images per class and 76.4% when trained with 40 images per class.
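The ensemble structure described above, one binary classifier per pair of classes combined by majority vote, can be sketched as follows. The pairwise classifiers here are hypothetical nearest-centroid rules standing in for the MIP-trained BNNs, and breaking ties in favour of the smallest label is an assumption of this example, not a detail taken from the abstract.

```python
from collections import Counter
from itertools import combinations


def make_pairwise_classifiers(centroids):
    """Hypothetical stand-in: one nearest-centroid rule per pair of classes.

    In the BeMi ensemble each pairwise classifier would be a BNN trained by a
    MIP model; that training step is not reproduced in this sketch.
    """
    classifiers = {}
    for a, b in combinations(sorted(centroids), 2):
        def classify(x, a=a, b=b):
            return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b
        classifiers[(a, b)] = classify
    return classifiers


def majority_vote(classifiers, x):
    """Predict the label that wins the most pairwise contests."""
    votes = Counter(clf(x) for clf in classifiers.values())
    top = max(votes.values())
    # Assumption: ties are broken in favour of the smallest label.
    return min(label for label, count in votes.items() if count == top)


if __name__ == "__main__":
    centroids = {0: 0.0, 1: 5.0, 2: 10.0}  # toy 1-D class centres (assumed)
    ensemble = make_pairwise_classifiers(centroids)
    for x in (0.5, 4.0, 9.0):
        print(x, "->", majority_vote(ensemble, x))
```

With $c$ classes this structure requires $c(c-1)/2$ pairwise classifiers, each of which only has to separate two classes.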
One of the common traits of past and present approaches for Semantic Role Labeling (SRL) is that they rely upon discrete labels drawn from a predefined linguistic inventory to classify predicate senses and their arguments. However, we argue this need not be the case. In this paper, we present an approach that leverages Definition Modeling to introduce a generalized formulation of SRL as the task of describing predicate-argument structures using natural language definitions instead of discrete labels. Our novel formulation takes a first step towards placing interpretability and flexibility foremost, and yet our experiments and analyses on PropBank-style and FrameNet-style, dependency-based and span-based SRL also demonstrate that a flexible model with an interpretable output does not necessarily come at the expense of performance. We release our software for research purposes at https://github.com/SapienzaNLP/dsrl.
In this new computing paradigm, named quantum computing, researchers from all over the world are taking their first steps in designing quantum circuits for image processing, through a difficult process of knowledge transfer. This effort is named Quantum Image Processing, an emerging research field pushed by powerful parallel computing capabilities of quantum computers. This work goes in this direction and proposes the challenging development of a powerful method of image denoising, such as the Total Variation (TV) model, in a quantum environment. The proposed Quantum TV is described and its sub-components are analysed. Despite the natural limitations of the current capabilities of quantum devices, the experimental results show a competitive denoising performance compared to the classical variational TV counterpart.
In this paper, we introduce the novel concept of an advisor network to address the problem of noisy labels in image classification. Deep neural networks (DNNs) are prone to performance degradation and overfitting when trained on data with noisy annotations. Loss-weighting methods aim to mitigate the influence of noisy labels during training by completely removing their contribution. This discarding process prevents DNNs from learning wrong associations between images and labels, but it reduces the amount of data used, especially when most of the samples have noisy labels. In contrast, our method weights the features extracted directly from the classifier without altering the loss value of each sample. The advisor helps the classifier focus only on part of the information present in mislabeled examples, allowing the classifier to leverage that data as well. We train the advisor with a meta-learning strategy so that it can adapt throughout the training of the main model. We test our method on CIFAR10 and CIFAR100 with synthetic noise, and on Clothing1M, which contains real-world noise, reporting state-of-the-art results.
In this paper, we present PARTIME, a software library written in Python and based on PyTorch, designed specifically to speed up neural networks whenever data is continuously streamed over time, for both learning and inference. Existing libraries are designed to exploit data-level parallelism, assuming that samples are batched, a condition that is not naturally met in applications based on streamed data. In contrast, PARTIME starts processing each data sample as soon as it becomes available from the stream. PARTIME wraps the code that implements a feed-forward multi-layer network and distributes the layer-wise processing among multiple devices, such as Graphics Processing Units (GPUs). Thanks to its pipeline-based computational scheme, PARTIME allows the devices to perform computations in parallel. At inference time, this results in scaling capabilities that are theoretically linear with respect to the number of devices. During the learning stage, PARTIME can leverage the non-i.i.d. nature of the streamed data, whose samples evolve smoothly over time, for efficient gradient computations. Experiments empirically compare PARTIME with classic non-parallel neural computations in online learning, distributing operations on up to 8 NVIDIA GPUs, and show significant speedups that are almost linear in the number of devices, mitigating the impact of the data transfer overhead.